Abstract

A given density matrix may be represented in many ways as a mixture of 
pure states. We show how any density matrix may be realized as a uniform 
ensemble. It has been conjectured that one may realize all probability distri- 
butions that are majorized by the vector of eigenvalues of the density matrix. 
We show that if the states in the ensemble are assumed to be distinct then 
it is not true, but a marginally weaker statement may still be true. 
1. Introduction



A key property of quantum mechanics is that every mixed state, that is 
every non-pure density matrix, can be written as an ensemble of pure states 
in many ways. There exists a well known characterization of this property, 
apparently first published by Schrödinger [1] [2]:

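Theorem 1: A density matrix $\rho$ having the diagonal form

$$\rho = \sum_{i=1}^{M} \lambda_i |e_i\rangle\langle e_i| \qquad (1)$$

can be written in the form

$$\rho = \sum_{i=1}^{N} p_i |\psi_i\rangle\langle\psi_i| , \qquad \sum_{i=1}^{N} p_i = 1 \qquad (2)$$

if and only if there exists a unitary N x N matrix U such that

$$|\psi_i\rangle = \frac{1}{\sqrt{p_i}} \sum_{j=1}^{M} U_{ij} \sqrt{\lambda_j}\, |e_j\rangle . \qquad (3)$$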

Here all states are normalized to unit length but may not be orthogonal to 
each other. 

Observe that the matrix U does not act on the Hilbert space but on vectors
whose components are state vectors, and also that we may well have N > M.
Only the first M columns of U appear in the equation; the remaining
N - M columns are added merely so that we can refer to the matrix U as
a unitary matrix. What the theorem basically tells us is that the pure states
$|\psi_i\rangle$ that make up an ensemble are linear combinations of the M
vectors $|e_i\rangle$ that make up the so-called "eigenensemble". Moreover an
arbitrary state in that linear span can be included. For definiteness we assume
from now on that all density matrices have rank M, so that we consider
ensembles of N pure states in an M dimensional Hilbert space.
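Theorem 1 can be checked numerically in small cases. The following is a minimal sketch, not part of the original argument: the function names, the choice of a discrete Fourier matrix as the unitary, and the eigenvalues (3/4, 1/4) are illustrative assumptions, and every resulting $p_i$ is assumed to be non-zero.

```python
import cmath
import math

def ensemble_from_unitary(eigenvalues, unitary):
    """Apply eq. (3): row i of `unitary` yields p_i and the state |psi_i>.

    Assumption: every resulting p_i is non-zero.
    """
    m = len(eigenvalues)
    probs, states = [], []
    for row in unitary:
        # Unnormalized state: components U_ij * sqrt(lambda_j), j = 1..M only.
        vec = [row[j] * math.sqrt(eigenvalues[j]) for j in range(m)]
        p = sum(abs(c) ** 2 for c in vec)  # squared norm, i.e. eq. (5)
        probs.append(p)
        states.append([c / math.sqrt(p) for c in vec])
    return probs, states

def mix(probs, states):
    """Return sum_i p_i |psi_i><psi_i| as a nested list, as in eq. (2)."""
    m = len(states[0])
    return [[sum(p * s[a] * s[b].conjugate() for p, s in zip(probs, states))
             for b in range(m)] for a in range(m)]

def fourier_unitary(n):
    """N x N discrete Fourier matrix, one convenient choice of unitary."""
    return [[cmath.exp(2j * math.pi * r * c / n) / math.sqrt(n)
             for c in range(n)] for r in range(n)]

eigenvalues = [0.75, 0.25]                                   # M = 2
probs, states = ensemble_from_unitary(eigenvalues, fourier_unitary(3))  # N = 3
rho = mix(probs, states)                                     # diag(3/4, 1/4) up to rounding
```

Only the first M = 2 columns of the 3 x 3 unitary enter the construction, in line with the remark above.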

One can say a bit more. Recall that there is a notion called "majorization"
that provides a natural partial preordering of probability distributions [3].
To be precise, assume that (if necessary) the eigenvalue vector $\vec{\lambda}$
has been extended with zeroes until it has the same number N of components as
$\vec{p}$, and also that the entries in the probability vectors have been
arranged in decreasing order. (In the sequel these assumptions will often be
made tacitly.) By definition the distribution $\vec{p}$ is majorized by the
distribution $\vec{\lambda}$, written $\vec{p} \prec \vec{\lambda}$, if and only if

$$\sum_{i=1}^{k} p_i \leq \sum_{i=1}^{k} \lambda_i \qquad (4)$$

for all k < N. In colloquial terms, the probability distribution $\vec{p}$ is
"more even" than the distribution $\vec{\lambda}$. Now it is an easy consequence
of Theorem 1 that the probability vector that appears there is given by

$$p_i = \sum_{j=1}^{M} B_{ij} \lambda_j , \qquad B_{ij} = |U_{ij}|^2 . \qquad (5)$$

The matrix B is bistochastic (all its matrix elements are non-negative and the
sum of each row and each column is unity). This follows because by construction
it is unistochastic, that is, each matrix element is the absolute value squared
of the corresponding element of a unitary matrix. All unistochastic matrices
are bistochastic, but the converse is not true. One can now show:

Theorem 2: Given a probability vector $\vec{p}$ there exists a set of pure states
$|\psi_i\rangle$ such that eq. (2) holds if and only if $\vec{p} \prec \vec{\lambda}$,
where $\vec{\lambda}$ is the eigenvalue vector.

In one direction this was shown by Uhlmann [4]: If such a decomposition
exists then $\vec{p} \prec \vec{\lambda}$ because all vectors that can be reached
from a given vector with a bistochastic matrix are majorized by the given
vector [3]. The converse was shown by Nielsen, who gave an algorithm for
constructing the states given the vector $\vec{p} \prec \vec{\lambda}$ [5]. We
will return to his construction below.
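The majorization condition of eq. (4) is easy to test mechanically. The sketch below is an illustration only; the function name `is_majorized_by` and the numerical tolerance are assumptions introduced here. It pads the shorter vector with zeroes and sorts both in decreasing order, as described above.

```python
def is_majorized_by(p, lam, tol=1e-12):
    """Return True if p is majorized by lam in the sense of eq. (4)."""
    n = max(len(p), len(lam))
    # Sort in decreasing order and extend with zeroes to a common length N.
    p_sorted = sorted(p, reverse=True) + [0.0] * (n - len(p))
    lam_sorted = sorted(lam, reverse=True) + [0.0] * (n - len(lam))
    # Both vectors must carry the same total weight (one, for probabilities).
    if abs(sum(p_sorted) - sum(lam_sorted)) > tol:
        return False
    partial_p = partial_lam = 0.0
    for a, b in zip(p_sorted, lam_sorted):
        partial_p += a
        partial_lam += b
        if partial_p > partial_lam + tol:  # a partial sum of p exceeds that of lam
            return False
    return True

print(is_majorized_by([0.5, 0.25, 0.25], [0.5, 0.5]))  # True
print(is_majorized_by([0.9, 0.1], [0.5, 0.5]))          # False
```

By Theorem 2, the first vector can therefore occur as the probability vector of an ensemble for the maximally mixed qubit state, while the second cannot.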

Why are these facts of interest? For one thing the components of $\vec{p}$ can
arise as the squares of the coefficients in a Schmidt decomposition of a
bipartite entangled state (and the density matrix then appears as the state of
a subsystem). These theorems then give insight into the different
representations that entangled states can be given. In particular Nielsen uses
this insight to obtain a new protocol for the conversion of one entangled state
to another by means of local operations and classical communication (LOCC)
[5]. As a general remark, increasing ability to manipulate quantum states in
the laboratory requires increasing precision in our understanding of how they
can be represented mathematically.

Now we can make a more precise statement, which has not been proved:

Conjecture 1: Given any probability vector $\vec{p}$ majorized by the eigenvalue
vector $\vec{\lambda}$ there exists a set of distinct pure states $|\psi_i\rangle$
such that eq. (2) holds.

There is no guarantee that the algorithm offered by Nielsen leads to an
ensemble of distinct states. In fact, as we will show, in general it does not;
indeed Conjecture 1 is false.

Our purpose is to formulate a new conjecture along the same lines that 
has a chance of being true. We begin in section 2 with a geometrical proof 
of a weak form of Conjecture 1, namely that any non-pure density matrix 
can be obtained as a uniform ensemble of pure states. We also explain why 
Conjecture 1 is in fact false. In section 3 we provide a review for physicists 
of the theory of majorization and bistochastic matrices. In section 4 we 
analyze the counterexamples to Conjecture 1 and collect some evidence for 
Conjecture 2, which will be a slight modification of the original. We fail to 
prove it though. Our conclusions are summarized in section 5. 



2. Uniform ensembles 



We want to construct a uniform ensemble (with all the $p_i$ equal to 1/N) for
an arbitrary quantum state. Provided that the state is not pure such an
ensemble can be constructed with very little ado. Let
$\rho = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_M)$.
Choose a pure state vector whose entries are the square roots of the
eigenvalues of $\rho$. Then form the one parameter family of state vectors

$$Z^{\alpha}(\tau) = \begin{pmatrix} e^{i n_1 \tau}\sqrt{\lambda_1} \\ e^{i n_2 \tau}\sqrt{\lambda_2} \\ e^{i n_3 \tau}\sqrt{\lambda_3} \end{pmatrix} . \qquad (6)$$

Here we choose M = 3 for illustrative purposes and the notation anticipates
the fact that we will choose the $n_i$ to be integers. Rewrite these state
vectors as projectors,

$$Z^{\alpha}\bar{Z}_{\beta}(\tau) = \begin{pmatrix} \lambda_1 & \sqrt{\lambda_1\lambda_2}\, e^{i n_{12}\tau} & \sqrt{\lambda_1\lambda_3}\, e^{i n_{13}\tau} \\ \sqrt{\lambda_2\lambda_1}\, e^{i n_{21}\tau} & \lambda_2 & \sqrt{\lambda_2\lambda_3}\, e^{i n_{23}\tau} \\ \sqrt{\lambda_3\lambda_1}\, e^{i n_{31}\tau} & \sqrt{\lambda_3\lambda_2}\, e^{i n_{32}\tau} & \lambda_3 \end{pmatrix} , \qquad (7)$$

where $n_{ij} \equiv n_i - n_j$. Form a uniform ensemble of pure states by

$$\rho' = \frac{1}{2\pi} \int_0^{2\pi} d\tau \, Z^{\alpha}(\tau)\bar{Z}_{\beta}(\tau) . \qquad (8)$$

Clearly if we choose the $n_i$ such that all the $n_{ij}$ with $i \neq j$ are
non-zero integers we will get $\rho' = \rho$, as was our aim. In geometrical
terms, what we are doing is to represent our density matrix as a uniform
distribution on a suitable closed Killing line, that is, a flowline of a
unitary transformation that leaves the original density matrix invariant. It
is also clear that we can get a finite distribution by placing N points on the
closed curve parametrized by $\tau$, using the roots of unity. Finally it is
clear that the argument works for all M. We illustrate the case M = 2 in
fig. 1. Note that if we regard the space of density matrices equipped with the
Hilbert-Schmidt metric as a subset of a flat Euclidean space then the closed
curve is not a circle in that flat space except when M = 2, but this is not
needed for the argument.

Figure 1: To the left we see a uniform distribution with N = 4 for the density
matrix $\rho = \mathrm{diag}(3/4, 1/4)$ in the Bloch ball and in the middle the
degenerate uniform distribution that one would obtain from Nielsen's procedure.
By studying the rightmost picture one can convince oneself that if
$p_1 = \lambda_1$ then any ensemble with N > 2 pure states must be degenerate.
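The finite version of this construction, with N points placed at the roots of unity, can be verified directly. The sketch below is an illustration under stated assumptions: the function names, the eigenvalues (0.5, 0.3, 0.2), the integers (0, 1, 2) and N = 5 are all choices made here. For the finite sum to reproduce the density matrix, the differences of the integers must additionally not be multiples of N, since only then do the off-diagonal sums over roots of unity vanish.

```python
import cmath
import math

def killing_line_states(eigenvalues, integers, n_points):
    """Sample the family of eq. (6) at tau = 2*pi*k/N for k = 0, ..., N-1."""
    states = []
    for k in range(n_points):
        tau = 2 * math.pi * k / n_points
        states.append([cmath.exp(1j * n * tau) * math.sqrt(lam)
                       for n, lam in zip(integers, eigenvalues)])
    return states

def uniform_mixture(states):
    """Average the projectors |Z><Z| with equal weights 1/N."""
    n_points, m = len(states), len(states[0])
    return [[sum(s[a] * s[b].conjugate() for s in states) / n_points
             for b in range(m)] for a in range(m)]

eigenvalues = [0.5, 0.3, 0.2]      # M = 3, illustrative values
states = killing_line_states(eigenvalues, integers=[0, 1, 2], n_points=5)
rho_prime = uniform_mixture(states)  # equals diag(0.5, 0.3, 0.2) up to rounding
```

The five states are pairwise distinct because the phase of the second component differs from point to point.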

Clearly there are many ways to realize a uniform ensemble. Nielsen [5]
provides a different procedure that relies on the theorem by Horn [6],
discussed in the next section. Horn's theorem tells us how to construct a
matrix U that obeys eq. (5) whenever $\vec{p}$ is majorized by $\vec{\lambda}$.
This matrix is then used in eq. (3). But there is no guarantee that the states
are distinct. Explicit calculation shows that generically they are not, except
when N = M. What this shows is that Theorem 1 must be used with some care.
While the rows of a unitary matrix are never equal, it may still be true that
the first M components of a pair of rows in a unitary N x N matrix coincide. If
this happens two of the pure states in the decomposition (2) will coincide too,
and the ensemble will in fact not be uniform (see fig. 1). An analogous
difficulty affects non-uniform ensembles as well. More details will be provided
in section 4, once we have sketched some relevant background.

Further inspection of the Bloch ball reveals that Conjecture 1 cannot be
true in general. It obviously fails for pure states. A more interesting
counterexample is the following: Let $(\lambda_1, \lambda_2) = (1/2, 1/2)$. This
is the maximally mixed state. Let $(p_1, p_2, p_3) = (1/2, 1/4, 1/4)$. Clearly
$\vec{p} \prec \vec{\lambda}$. But it is geometrically evident that an ensemble
of distinct pure states with these $p_i$ as probabilities cannot give the
maximally mixed state: Our three states define a plane through the Bloch ball
that has to go through the center of the ball (where the maximally mixed state
sits). The convex sum of the two states with probability 1/4 lies inside the
ball and the density matrix must lie on the straight line between that point
and that of the state with probability 1/2. But then the density matrix cannot
lie in the center of the ball as stated. An extension of this argument shows
that when M = 2 it is always impossible to realize a non-degenerate ensemble
with $p_1 = \lambda_1$ and N > 2. See fig. 1.
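The same counterexample can be phrased as a short computation with Bloch vectors. This is a sketch under the assumption that the three pure states are represented by unit vectors $\vec{n}_1, \vec{n}_2, \vec{n}_3$ on the Bloch sphere (notation introduced here for illustration) and that the maximally mixed state sits at the origin.

```latex
\tfrac{1}{2}\vec{n}_1 + \tfrac{1}{4}\vec{n}_2 + \tfrac{1}{4}\vec{n}_3 = 0
\quad\Longrightarrow\quad
|\vec{n}_2 + \vec{n}_3| = 2\,|\vec{n}_1| = 2
\quad\Longrightarrow\quad
\vec{n}_2 = \vec{n}_3 = -\vec{n}_1 .
```

The last step uses the triangle inequality $|\vec{n}_2 + \vec{n}_3| \leq 2$, with equality only for coinciding unit vectors. The two states with probability 1/4 are therefore forced to be the same state, so the ensemble is degenerate.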

3. Majorization and bistochastic matrices. 

To bring the issues into focus a review of the mathematical background is
called for. Majorization, as defined in the introduction, provides a natural
partial preordering of vectors, and in particular of discrete probability
distributions (vectors with non-negative components that sum to one). It is a
preordering because $\vec{p} \prec \vec{q}$ and $\vec{q} \prec \vec{p}$ does not
imply $\vec{p} = \vec{q}$, only that the vector $\vec{q}$ is obtained by
permuting the components of $\vec{p}$. The notion is important in many
contexts, ranging from economics to LOCC (Local Operations and Classical
Communication) of entangled states in quantum mechanics [7].
Figure 2: The probability simplex for N = 3. To the left we see the set of vectors
majorized by a given vector sitting in the corner of the shaded polytope. To the 
right we see how to get from one probability vector to another by means of two 
T-transforms. 

The set of probability vectors is a convex simplex, and the set of such
vectors that are majorized by a given vector forms a convex polytope with
its corners at the N! vectors obtained by permuting the N components of
the given vector. It is helpful to keep the simplest example in mind. Let the
number of components be N = 3. Then the set of probability vectors forms a
triangle, and the set of vectors majorized by a given vector is easily
recognized (see fig. 2).

A basic fact about majorization is a theorem due to Hardy, Littlewood
and Pólya, which states that $\vec{p} \prec \vec{q}$ if and only if there exists
a bistochastic matrix B such that $\vec{p} = B\vec{q}$. A stochastic matrix is a
matrix with non-negative entries such that the elements in each column sum to
unity, which means that the matrix transforms probability vectors to
probability vectors. It is bistochastic if also its rows sum to unity, which
means that the uniform distribution $\vec{e} = \frac{1}{N}(1, 1, \ldots, 1)$ is
a fixed point of the map. According to Birkhoff's theorem the space of
bistochastic N by N matrices is an $(N-1)^2$ dimensional convex polytope with
the N! permutation matrices making up its corners. In the center of the
polytope we find the van der Waerden matrix $B_*$, all of whose entries are
equal to 1/N.

Some special cases of bistochastic matrices will be of interest below. A
T-transform is a bistochastic matrix that acts non-trivially only on two
entries of the vectors. By means of permutations it can therefore be brought
to the form

$$T = \begin{pmatrix} t & 1-t & 0 \\ 1-t & t & 0 \\ 0 & 0 & \mathbf{1} \end{pmatrix} , \qquad 0 \leq t \leq 1 . \qquad (9)$$

Given two vectors $\vec{p} \prec \vec{q}$ there always exists a sequence of N - 1
T-transforms such that $\vec{p} = T_{N-1} \ldots T_1 \vec{q}$. On the other hand
(except when N = 2) it is not true that every bistochastic matrix can be
written as a sequence of T-transforms. Fig. 2 shows how T-transforms act when
N = 3.

Figure 3: A two dimensional slice through Birkhoff's polytope; the shaded region
consists of unistochastic matrices and the dot in the center represents the van
der Waerden matrix. Some interesting observations are made about it in ref. [9],
where a closely related picture is drawn.
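To make the statement about sequences of T-transforms concrete, here is a minimal sketch of the standard textbook step used to prove it (it is not taken from the present text): pick the largest index j with $p_j < q_j$ and the smallest later index k with $p_k > q_k$, and move weight from entry j to entry k. The function names and the example vectors are assumptions; both vectors are assumed sorted in decreasing order with $\vec{p} \prec \vec{q}$, and exact fractions are used to avoid rounding issues in the comparisons.

```python
from fractions import Fraction

def t_transform_step(p, q):
    """One T-transform on q that moves it towards p; returns ((j, k, t), new_q)."""
    n = len(p)
    j = max(i for i in range(n) if p[i] < q[i])          # largest index where q is too big
    k = min(i for i in range(j + 1, n) if p[i] > q[i])   # next index where q is too small
    delta = min(q[j] - p[j], p[k] - q[k])                # weight moved from entry j to k
    t = 1 - delta / (q[j] - q[k])                        # parameter of the T-transform
    new_q = list(q)
    new_q[j] = t * q[j] + (1 - t) * q[k]
    new_q[k] = (1 - t) * q[j] + t * q[k]
    return (j, k, t), new_q

def t_transform_chain(p, q):
    """Repeat the step until q has been transformed into p."""
    steps, current = [], list(q)
    while current != list(p):
        step, current = t_transform_step(p, current)
        steps.append(step)
    return steps, current

p = [Fraction(2, 5), Fraction(7, 20), Fraction(1, 4)]
q = [Fraction(3, 5), Fraction(3, 10), Fraction(1, 10)]
steps, result = t_transform_chain(p, q)   # two steps for N = 3
```

In this example the chain consists of N - 1 = 2 T-transforms, the first acting on entries 1 and 2 with t = 5/6 and the second on entries 1 and 3 with t = 2/3.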

Unistochastic matrices were defined in the introduction. An orthostochastic
matrix is a special case in which the matrix elements of the bistochastic
matrix are given by the squares of the corresponding elements of an orthogonal
matrix. Horn's theorem [6] states that given $\vec{p} \prec \vec{q}$ one can
always find an orthostochastic matrix B such that $\vec{p} = B\vec{q}$. The
proof is by induction and actually gives a construction of B as a sequence of
N - 1 T-transforms. The set of unistochastic matrices forms a compact connected
subset of the set of bistochastic matrices. When N > 2 not all bistochastic
matrices are
unistochastic (see fig. 3). The van der Waerden matrix is unistochastic. It
will be of interest below to know that sequences of T-transforms are always
unistochastic when N = 3, but that this is not so when N > 3 [8]. For a
more extensive discussion of unistochastic matrices we recommend reference
[9].

4. Conjectures.

We can now return to the question of precisely which probability vectors
$\vec{p}$ can occur in non-degenerate ensembles for a density matrix with
eigenvalue vector $\vec{\lambda}$. We know that $\vec{p} \prec \vec{\lambda}$.
Nielsen's idea [5] is to rely on Horn's theorem to provide an orthostochastic
matrix B connecting the two vectors. The catch, from our point of view, is that
the resulting ensemble will be degenerate in generic cases.

In section 2 we pointed out a class of counterexamples to Conjecture 1
for two dimensional Hilbert spaces. We can now see in a different way how
they arise for arbitrary dimension N. Let

$$p_1 + \ldots + p_{k-1} = \lambda_1 + \ldots + \lambda_{k-1} \qquad (10)$$

and assume $\lambda_{k-1} > \lambda_k$. It is then easy to show that any
bistochastic matrix connecting the two vectors must take the block diagonal form

$$B = \begin{pmatrix} D_1 & 0 \\ 0 & D_2 \end{pmatrix} , \qquad (11)$$

where $D_1$ and $D_2$ are bistochastic matrices in themselves. $D_1$ is a
(k - 1) x (k - 1) matrix. If k = M then the form of B means that only one
column of $D_2$ is actually used in constructing no less than N - M + 1 of the
pure states in eq. (3), so that all these state vectors are contained in a one
dimensional subspace. Hence the ensemble must have at least that degree of
degeneracy, for essentially the same reason that a pure density matrix leads
to a totally degenerate ensemble. It is tempting to guess that this is the only
kind of counterexample to Conjecture 1 that can arise. Note however that
the geometric argument in section 2 was actually a little more general as far
as the case M = 2 is concerned, and excludes also the case
$\lambda_{M-1} = \lambda_M$.
When k < M in eq. (10) there is no obvious reason why the block
diagonal form of B must lead to degeneracies, and in fact this is not so in the 
(few) examples that we checked. In the concluding section we will conjecture 
that we have already found all the necessary additional restrictions that are 
missing from Conjecture 1. 

Let us approach the problem from another direction. Given
$\vec{p} \prec \vec{\lambda}$, can we find an algorithm for constructing a
unistochastic matrix B such that $\vec{p} = B\vec{\lambda}$ and such that the
corresponding unitary matrix leads to a non-degenerate ensemble? To begin with
let us assume that $\vec{p} = \vec{e}$ and let us ask for a sequence of
T-transforms that produces the "natural" uniform ensemble presented in
section 2. An algorithm that does this is to first apply a T-transform $T_1$
with t = 1/2 to the first two entries in the vector, then a T-transform $T_2$
with t = 1/2 to the second and third entries, and so on until $T_{N-1}$ sets
the last two components of the vector equal. Then we repeat the procedure
an infinite number of times. We get



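$$\lim_{n \to \infty} (T_{N-1} T_{N-2} \cdots T_1)^n = \frac{1}{N} \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \cdots & 1 \end{pmatrix} . \qquad (12)$$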



This is the van der Waerden matrix $B_*$, which is unistochastic and when used
in Schrödinger's theorem does in fact lead to the natural uniform ensemble
from section 2. To see that the sequence converges to $B_*$ we simply observe
that the T-transforms do not depend on the vector that we start out with.
Therefore we can read off the columns of $B_*$ by seeing how it acts on the
corners of the probability simplex, that is the vectors (1, 0, ..., 0) and so on.
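The convergence stated in eq. (12) can be observed numerically. The sketch below is only an illustration; the helper names and the number of repetitions are assumptions made here, and plain nested lists stand in for matrices.

```python
def t_transform(n, i, t=0.5):
    """N x N T-transform mixing entries i and i+1 (0-based) with parameter t."""
    m = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    m[i][i] = m[i + 1][i + 1] = t
    m[i][i + 1] = m[i + 1][i] = 1.0 - t
    return m

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def sweep(n):
    """The product T_{N-1} ... T_2 T_1, with T_1 acting first and all t = 1/2."""
    product = t_transform(n, 0)
    for i in range(1, n - 1):
        product = matmul(t_transform(n, i), product)
    return product

def iterate_sweeps(n, repetitions):
    """Return (T_{N-1} ... T_1) raised to the given power."""
    single = sweep(n)
    result = single
    for _ in range(repetitions - 1):
        result = matmul(single, result)
    return result

limit_3 = iterate_sweeps(3, 60)   # every entry is close to 1/3
```

For N = 3 a single sweep already has two identical columns, and repeated sweeps rapidly flatten the remaining differences.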

Clearly this algorithm can be generalized to arbitrary $\vec{p}$ and
$\vec{\lambda}$. The key idea is to choose, at each step, a T-transform that
ensures the equality $p_k/p_{k-1} = \lambda_k/\lambda_{k-1}$. Typically this
will again converge to a definite bistochastic matrix B in an infinite number
of steps, this time because the individual T-transforms approach the unit
matrix. In the examples that we checked (mostly for three-by-three matrices)
it does produce a non-degenerate ensemble except for the counterexamples we
already have. We therefore have a candidate for a constructive algorithm with
the desired properties.

Unfortunately we do not know if the candidate is good enough. We do
not know that it always results in a unistochastic matrix, let alone a
non-degenerate ensemble. Already the first question becomes non-trivial when
N > 3, as noted in section 3. What we do know is that a sequence of
T-transforms always meets the "chain-links" conditions from ref. [10] (see also
[9]). These are necessary but not sufficient conditions for a bistochastic
matrix to be unistochastic. The idea is as follows: Take a bistochastic matrix

$$B = \begin{pmatrix} a_1 & b_1 & \cdot \\ a_2 & b_2 & \cdot \\ a_3 & b_3 & \cdot \end{pmatrix} . \qquad (13)$$

Form the "links" $L_i = \sqrt{a_i b_i}$. If B is unistochastic it must be true that

$$L_1 \leq L_2 + L_3 , \quad L_2 \leq L_3 + L_1 \quad \mbox{and} \quad L_3 \leq L_1 + L_2 . \qquad (14)$$

These are called the chain-links conditions (stated in terms of columns)
because they make it possible to form a "chain" (in this case a triangle) out of
the links. This in turn ensures that a set of phases $\mu_1$, $\mu_2$ can be
found such that the matrix

$$U = \begin{pmatrix} \sqrt{a_1} & \sqrt{b_1} & \cdot \\ \sqrt{a_2} & \sqrt{b_2}\, e^{i\mu_1} & \cdot \\ \sqrt{a_3} & \sqrt{b_3}\, e^{i\mu_2} & \cdot \end{pmatrix} \qquad (15)$$

is unitary. (It is unnecessary to check the last column.) For three by three
matrices the chain-links conditions are sufficient. For N by N matrices with
N > 3 the story becomes more complicated. The chain-links conditions
still state that no one of the lengths (constructed analogously) can be larger
than the sum of all the others. When one tries to construct the unitary
matrix the number of equations to solve is the same as the number of phases
available, but it can happen that the equations have no solution. Therefore
the chain-links conditions are necessary but not sufficient when N > 3.

Lemma: A sequence of T-transforms always obeys the chain-links conditions.

Sketch of proof: A single T-transform is unistochastic. Consider any
sequence of T-transforms. Suppose that the first n of these form a matrix B
that obeys the chain-links conditions. It is enough to prove that this
implies that TB obeys the chain-links conditions, where T is any T-transform.
(Since the set of matrices that obey these conditions is compact, infinite
sequences pose no particular problem.) Consider therefore

$$T(t)B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & t & 1-t \\ 0 & 1-t & t \end{pmatrix} \begin{pmatrix} a_1 & b_1 & \cdot \\ a_2 & b_2 & \cdot \\ a_3 & b_3 & \cdot \end{pmatrix} , \qquad 0 \leq t \leq 1 . \qquad (16)$$

With the links $L_i(t)$ formed from the matrix T(t)B as indicated above, we
must show that

$$L_1(1) \leq L_2(1) + L_3(1) \quad \Rightarrow \quad L_1(t) \leq L_2(t) + L_3(t) . \qquad (17)$$

(Note that T(1)B = B which obeys the conditions by assumption.) Since
$L_1(t)$ is constant it is enough to verify that the function
$L_2(t) + L_3(t)$ is nowhere smaller than $L_2(1) + L_3(1)$ in the interval. We
observe that this function is symmetric around t = 1/2 and a straightforward
calculation verifies that it assumes its only extremum there, and that this
extremum is a maximum. The proof that the other chain-links conditions hold is
similar. Extension of the proof to larger matrices and to the chain-links
condition stated in terms of rows rather than columns is also straightforward.
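The chain-links conditions in terms of columns are simple to test for a given bistochastic matrix. The following sketch is illustrative only: the function names and the tolerance are assumptions, and for simplicity every pair of columns is checked, which is more than strictly needed for three by three matrices.

```python
import math

def chain_links_ok(col_a, col_b, tol=1e-12):
    """Check that no link sqrt(a_i * b_i) exceeds the sum of all the others."""
    links = [math.sqrt(a * b) for a, b in zip(col_a, col_b)]
    total = sum(links)
    return all(link <= (total - link) + tol for link in links)

def columns_obey_chain_links(matrix):
    """Apply the chain-links test to every pair of columns of a square matrix."""
    n = len(matrix)
    cols = [[matrix[r][c] for r in range(n)] for c in range(n)]
    return all(chain_links_ok(cols[i], cols[j])
               for i in range(n) for j in range(i + 1, n))

van_der_waerden = [[1.0 / 3.0] * 3 for _ in range(3)]
two_t_transforms = [[0.5, 0.5, 0.0], [0.25, 0.25, 0.5], [0.25, 0.25, 0.5]]
not_unistochastic = [[0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]]

print(columns_obey_chain_links(van_der_waerden))    # True
print(columns_obey_chain_links(two_t_transforms))   # True
print(columns_obey_chain_links(not_unistochastic))  # False
```

The second matrix is the product of two T-transforms with t = 1/2 for N = 3 and passes the test, as the lemma requires; it saturates one of the inequalities. The third matrix is bistochastic but violates the conditions, so it cannot be unistochastic.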

For three-by-three matrices this establishes the result of ref. [8], that any
sequence of T-transforms is unistochastic. For larger matrices the chain-links 
condition is necessary but not sufficient for that and there do exist sequences 
of T-transforms that are not unistochastic. It remains possible that the 
particular kind of sequence that we propose as an algorithm to realize an 
ensemble with probability vector p always results in unistochastic matrices 
but we have failed to prove this. Assuming that this can be done we would 
still have to prove that the ensemble that results from the unitary matrix 
so constructed is non-degenerate for all allowed probability vectors. This 
appears to be significantly more difficult. 

5. Conclusions. 

We have studied the question of precisely what kind of discrete probability
distributions can appear in an ensemble of pure states that describes a
given density matrix. Evidently our main conclusion is that this question
has more facets to it than one might suspect based on earlier literature [5].
We think that one answer is the following:

Conjecture 2: Let $\vec{p}$ be any probability vector majorized by the
eigenvalue vector $\vec{\lambda}$, and assume that the density matrix is not
pure. Then there exists a set of distinct pure states $|\psi_i\rangle$ such that
eq. (2) holds if and only if
$p_1 + \ldots + p_{M-1} \neq \lambda_1 + \ldots + \lambda_{M-1}$.

The "only if" part of the statement is proved for M = 2. For M > 2 the
case $\lambda_{M-1} = \lambda_M$ may need separate attention, otherwise the
"only if" part is again proved. The "if" part is only weakly supported by
arguments and examples.

We also made a suggestion for how, given a p consistent with Conjecture 
2, one might go about to construct such an ensemble by means of a sequence 
of T-transforms, but this suggestion is very weakly supported. Essentially 
the only argument is that the procedure does give the geometrically natural 
uniform ensemble when $\vec{p} = \vec{e}$.